Impact of Dietary Shifts on Gut Microbiome Dynamics

Multivariate Insights Using R

R for Bio Data Analysis

Group 16: Eric Torres, Lucia de Lamadrid, Konstantina Gkopi, Elena Iriondo and Jorge Santiago

2024-12-03

Introduction

Our aim:

To study the relationship between the composition of the gut microbiota and factors such as diet and colonisation history.

Materials and Methods

General Workflow

MICROBIOME METADATA:

# A tibble: 6 × 6,701
   Diet Source Donor CollectionMet   Sex     OTU0     OTU1     OTU2     OTU3
  <dbl>  <dbl> <dbl>         <dbl> <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1     0      0     0             0     0 1.56e-11 4.72e-11 1.23e-11 4.52e-11
2     0      1     0             0     0 2.36e-11 9.53e-11 3.33e-11 2.67e-11
3     0      2     0             1     0 6.77e-11 3.68e-11 8.02e-11 5.49e-11
4     0      2     0             0     0 5.52e-11 9.89e-11 4.58e-11 3.54e-11
5     0      3     0             0     0 5.24e-11 6.34e-11 2.35e-11 7.47e-11
6     0      4     0             1     0 7.67e-11 7.22e-11 5.41e-11 1.20e-11
# ℹ 6,692 more variables: OTU4 <dbl>, OTU5 <dbl>, OTU6 <dbl>, OTU7 <dbl>,
#   OTU8 <dbl>, OTU9 <dbl>, OTU10 <dbl>, OTU11 <dbl>, OTU12 <dbl>, OTU13 <dbl>,
#   OTU14 <dbl>, OTU15 <dbl>, OTU16 <dbl>, OTU17 <dbl>, OTU18 <dbl>,
#   OTU19 <dbl>, OTU20 <dbl>, OTU21 <dbl>, OTU22 <dbl>, OTU23 <dbl>,
#   OTU24 <dbl>, OTU25 <dbl>, OTU26 <dbl>, OTU27 <dbl>, OTU28 <dbl>,
#   OTU29 <dbl>, OTU30 <dbl>, OTU31 <dbl>, OTU32 <dbl>, OTU33 <dbl>,
#   OTU34 <dbl>, OTU35 <dbl>, OTU36 <dbl>, OTU37 <dbl>, OTU38 <dbl>, …

OTU TAXONOMY GLOSSARY:

  OTU.ID  Kingdom        Phylum         Class           Order
1   OTU0 Bacteria                                            
2   OTU1 Bacteria    Firmicutes    Clostridia   Clostridiales
3   OTU2 Bacteria    Firmicutes       Bacilli Lactobacillales
4   OTU3 Bacteria Bacteroidetes Bacteroidetes   Bacteroidales
5   OTU4 Bacteria Bacteroidetes                              
6   OTU5 Bacteria    Firmicutes    Clostridia   Clostridiales
              Family           Genus X X.1
1                                         
2    Ruminococcaceae                      
3    Enterococcaceae    Enterococcus      
4 Porphyromonadaceae Parabacteroides      
5                                         
6                                         

Data Tidying and Filtering

  • Added a SampleID column to uniquely identify each sample.

  • Transformed the dataset from wide to long format for easier analysis.

  • Kept only the most abundant OTUs, which together account for up to 95% of cumulative mean abundance.

  • Replaced the numeric codes with descriptive labels.

# Creation and relocation of SampleID
metadata_df <- metadata_df |>
  mutate(SampleID = row_number()) |>  # Create SampleID from the row number
  relocate(SampleID, 
           .before = everything())  # Move SampleID to the first position

# Pivot the OTU columns from wide to long format
metadata_df_long <- metadata_df |> 
  pivot_longer(
    cols = starts_with("OTU"), 
    names_to = "OTU", 
    values_to = "rel_abundance"
  )

head(metadata_df_long)

# Calculate cumulative contribution
cumulative_otus <- metadata_df_long |>
  group_by(OTU) |>
  summarize(mean_abundance = mean(rel_abundance)) |>
  arrange(desc(mean_abundance)) |>
  mutate(cumulative_abundance = cumsum(mean_abundance) / sum(mean_abundance))

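# Select OTUs contributing to 95% cumulative abundance
otus_to_keep <- cumulative_otus |>
  filter(cumulative_abundance <= 0.95) |>
  pull(OTU)

# Keep only the selected OTUs
# (assumed step: filtered_metadata is used below but its definition is not shown)
filtered_metadata <- metadata_df_long |>
  filter(OTU %in% otus_to_keep)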

# Number of OTUs before filtering
n_total_otus <- metadata_df_long |> 
  pull(OTU) |> 
  n_distinct()

# Number of OTUs after filtering
n_filtered_otus <- filtered_metadata |> 
  pull(OTU) |> 
  n_distinct()
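
To see what the `<= 0.95` cutoff keeps, here is a minimal sketch on toy data. The OTU names and abundances are made up for illustration and are not taken from the dataset.

```r
library(dplyr)

# Toy mean abundances for five hypothetical OTUs (illustrative values only)
toy_otus <- tibble(
  OTU = c("OTU_a", "OTU_b", "OTU_c", "OTU_d", "OTU_e"),
  mean_abundance = c(0.50, 0.30, 0.12, 0.05, 0.03)
)

# Rank OTUs by abundance and compute their running share of the total
toy_cumulative <- toy_otus |>
  arrange(desc(mean_abundance)) |>
  mutate(cumulative_abundance = cumsum(mean_abundance) / sum(mean_abundance))

# Keep OTUs whose running share stays at or below the 95% cutoff
toy_kept <- toy_cumulative |>
  filter(cumulative_abundance <= 0.95) |>
  pull(OTU)

toy_kept
```

In this toy case the running shares are 0.50, 0.80, 0.92, 0.97 and 1.00, so the OTU that crosses the threshold is dropped and the kept OTUs cover 92% rather than exactly 95% of the abundance.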

# Replace the numeric codes with descriptive labels
filtered_metadata_stricter_label <- filtered_metadata_stricter |> 
  mutate(Diet = case_when(Diet == 0 ~ "LFPP",
                          Diet == 1 ~ "Western",
                          Diet == 2 ~ "CARBR",
                          Diet == 3 ~ "FATR",
                          Diet == 4 ~ "Suckling",
                          Diet == 5 ~ "Human")) |> 
  mutate(Source = case_when(Source == 0 ~ "Cecum1",
                          Source == 1 ~ "Cecum2", 
                          Source == 2 ~ "Colon1", 
                          Source == 3 ~ "Colon2", 
                          Source == 4 ~ "Feces",
                          Source == 5 ~ "SI1",
                          Source == 6 ~ "SI13", 
                          Source == 7 ~ "SI15", 
                          Source == 8 ~ "SI2", 
                          Source == 9 ~ "SI5",
                          Source == 10 ~ "SI9", 
                          Source == 11 ~ "Stomach", 
                          Source == 12 ~ "Cecum")) |> 
  mutate(Donor = case_when(Donor == 0 ~ "HMouseLFPP",
                          Donor == 1 ~ "CONVR", 
                          Donor == 2 ~ "Human", 
                          Donor == 3 ~ "Fresh", 
                          Donor == 4 ~ "Frozen",
                          Donor == 5 ~ "HMouseWestern", 
                          Donor == 6 ~ "CONVD")) |> 
  mutate(CollectionMet = case_when(CollectionMet == 0 ~ "Contents",
                                   CollectionMet == 1 ~ "Scraping")) |> 
  mutate(Sex = case_when(Sex == 0 ~ "Male",
                         Sex == 1 ~ "Female")) 
head(filtered_metadata_stricter_label)

Now, our data is tidy!

# A tibble: 6 × 8
  SampleID Diet  Source Donor      CollectionMet Sex   OTU   rel_abundance
     <dbl> <chr> <chr>  <chr>      <chr>         <chr> <chr>         <dbl>
1        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU6       3.31e-11
2        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU7       5.08e-11
3        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU9       2.57e- 3
4        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU41      7.95e-11
5        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU58      2.53e-11
6        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU77      1.28e- 3

and ready to be augmented…

We use the OTU taxonomy file to add the phylum and class of each OTU as new columns, using left_join.

clean_df_taxonomy <- clean_df |>  
  left_join(otu_df_modified, 
            join_by(OTU == OTU.ID)) |> 
  relocate(Phylum, Class, .after = OTU) 

head(clean_df_taxonomy)

# A tibble: 6 × 10
  SampleID Diet  Source Donor      CollectionMet Sex   OTU   Phylum     Class   
     <dbl> <chr> <chr>  <chr>      <chr>         <chr> <chr> <chr>      <chr>   
1        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU6  Firmicutes Bacilli 
2        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU7  Firmicutes Clostri…
3        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU9  Firmicutes Clostri…
4        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU41 Firmicutes Bacilli 
5        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU58 Firmicutes Clostri…
6        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU77 Firmicutes Clostri…
# ℹ 1 more variable: rel_abundance <dbl>
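
A left join keeps every abundance row even when an OTU has no taxonomy entry, so it can be worth checking for unmatched OTUs afterwards. A minimal sketch, assuming toy tables whose column names mirror the ones above (the values, including `OTU999`, are hypothetical):

```r
library(dplyr)

# Hypothetical toy tables; values are made up for illustration
toy_abund <- tibble(
  OTU = c("OTU1", "OTU2", "OTU999"),
  rel_abundance = c(0.6, 0.3, 0.1)
)
toy_taxonomy <- tibble(
  OTU.ID = c("OTU1", "OTU2"),
  Phylum = c("Firmicutes", "Firmicutes"),
  Class  = c("Clostridia", "Bacilli")
)

# Same join pattern as above: match OTU against OTU.ID
toy_joined <- toy_abund |>
  left_join(toy_taxonomy, join_by(OTU == OTU.ID))

# OTUs without a taxonomy entry keep their row but get NA in Phylum and Class
unmatched <- toy_joined |>
  filter(is.na(Phylum)) |>
  pull(OTU)

unmatched
```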

Results and Discussion

Phylum-level microbiota composition across:

  • Sources and Diet Types

  • Diet and Donor Combination
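
One way such phylum-level compositions could be computed from the long table is sketched below. This is an assumption about the approach, not the code behind the slides, and `toy_long` is a made-up stand-in for the annotated data.

```r
library(dplyr)

# Hypothetical long-format data: two samples, one per diet (made-up values)
toy_long <- tibble(
  SampleID = c(1, 1, 1, 2, 2, 2),
  Diet = c("LFPP", "LFPP", "LFPP", "Western", "Western", "Western"),
  Phylum = c("Firmicutes", "Firmicutes", "Bacteroidetes",
             "Firmicutes", "Bacteroidetes", "Bacteroidetes"),
  rel_abundance = c(0.2, 0.3, 0.5, 0.7, 0.1, 0.2)
)

# Sum OTU abundances per sample and phylum, then average across samples per diet
phylum_by_diet <- toy_long |>
  group_by(Diet, SampleID, Phylum) |>
  summarise(phylum_abundance = sum(rel_abundance), .groups = "drop") |>
  group_by(Diet, Phylum) |>
  summarise(mean_abundance = mean(phylum_abundance), .groups = "drop")

phylum_by_diet
```

A table like this could then be drawn as stacked bars, for example with ggplot2's `geom_col(position = "fill")`, grouping by Source or Donor instead of Diet as needed.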


Principal Component Analysis on Phylum-Level Aggregated Microbiome Data

  • The PCA variance table shows that PC1 and PC2 together explain about 60% of the total variance. Adding PC3 and PC4 raises the cumulative explained variance to 90%, capturing most of the dataset’s variability.

  • Western diet correlates with a negative PC1 coordinate, which matches the Western diet observations lying on the left of the PCA scatter plot.

  • Unclassified phyla behave in the opposite way to the diet variable.

  • Bacteroidetes and Firmicutes also behave in opposite ways, consistent with the trade-off between these two phyla captured by the Firmicutes-Bacteroidetes ratio.
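
The explained-variance figures and variable directions discussed above come from quantities that base R's `prcomp()` returns. A minimal sketch, assuming a random toy matrix (`toy_phyla`, hypothetical) in place of the real phylum-level data:

```r
set.seed(1)

# Hypothetical phylum-level matrix: 20 samples x 4 phyla (random values, illustration only)
toy_phyla <- matrix(
  runif(80), nrow = 20,
  dimnames = list(NULL, c("Firmicutes", "Bacteroidetes", "Proteobacteria", "Unclassified"))
)

# Centre and scale each column, then run the PCA
pca_fit <- prcomp(toy_phyla, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component, and its running total
var_explained <- pca_fit$sdev^2 / sum(pca_fit$sdev^2)
cum_var_explained <- cumsum(var_explained)

# Loadings on PC1: variables with opposite signs move in opposite directions along PC1
pca_fit$rotation[, "PC1"]
```

With the real data, `cum_var_explained[2]` and `cum_var_explained[4]` would correspond to the roughly 60% and 90% figures quoted above.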

Principal Component Analysis on Phylum-Level Aggregated Microbiome Data

  • The clear separation between the two diet groups (green and pink points) indicates that microbiome composition is strongly influenced by diet.

  • Samples from the Western diet have distinct characteristics compared to those from the LFPP diet, as reflected in their separation along the principal components.

Analysis of Microbiome Clusters by Donor Groups Using Hierarchical Clustering

# Compute Euclidean distance matrix
dist_matrix <- otu_data_scaled |>
  dist()

# Perform hierarchical clustering
hclust_result <- hclust(dist_matrix, method = "ward.D2")

# Cut dendrogram into 3 clusters
cluster_labels <- cutree(hclust_result, k = 3) |>
  as_tibble() |>
  rename(Cluster = value)

# Perform chi-squared test on the donor-by-cluster contingency table
chi2_result <- chisq.test(donor_cluster_table)
chi2_result

    Pearson's Chi-squared test

data:  donor_cluster_table
X-squared = 659.91, df = 8, p-value < 2.2e-16

  • Cluster 1 is dominated by HMouseLFPP (55.5%) with notable contributions from Frozen (17.8%) and Fresh (18.7%), reflecting plant-rich diets and preserved samples.

  • Cluster 2 includes mostly Fresh (55.1%) and HMouseLFPP (26.5%), indicating a mix of human-derived and dietary influences.

  • Cluster 3 is almost entirely CONVR (95%), representing natural microbiota from control mice.

  • The chi-squared test (p < 2.2e-16) confirms a significant association between donor origin and cluster membership, highlighting the influence of donors on microbiota composition.
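
The construction of `donor_cluster_table` is not shown; a contingency table of this kind could be built with `table()`, as in the sketch below. The donor labels and cluster assignments are made up. For a table with r rows and c columns the test has (r - 1)(c - 1) degrees of freedom, so df = 8 with three clusters would correspond to five donor categories.

```r
# Hypothetical donor labels and cluster assignments (made-up values)
toy_donor <- c(rep("CONVR", 30), rep("Fresh", 30), rep("HMouseLFPP", 30))
toy_cluster <- c(rep(3, 28), rep(1, 2),
                 rep(2, 20), rep(1, 10),
                 rep(1, 22), rep(2, 8))

# Cross-tabulate donors against clusters
toy_table <- table(Donor = toy_donor, Cluster = toy_cluster)

# Share of each donor within each cluster (each column sums to 1)
cluster_composition <- prop.table(toy_table, margin = 2)

# Chi-squared test of independence between donor and cluster
toy_chi2 <- chisq.test(toy_table)
toy_chi2$parameter  # degrees of freedom: (3 - 1) * (3 - 1)
```

Column-wise proportions such as `cluster_composition` are the kind of figure behind statements like "Cluster 3 is almost entirely CONVR".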

Biodiversity and diet

Shannon diversity index

Accounts for both:

  • The number of species living in a habitat (richness)
  • How evenly their relative abundances are distributed (evenness)

\[ H' = -\sum_{i=1}^{R} p_i \ln p_i \]

where \(p_i\) is the relative abundance of OTU \(i\) and \(R\) is the total number of OTUs.
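
The Shannon index defined above can be computed per sample in a few lines of base R. This is a minimal sketch, with `shannon_index` as a hypothetical helper name and toy abundance vectors:

```r
# Shannon index H' = -sum(p_i * ln(p_i)) over OTUs with non-zero abundance
shannon_index <- function(abundance) {
  p <- abundance / sum(abundance)  # normalise to relative abundances
  p <- p[p > 0]                    # 0 * ln(0) is taken as 0, so drop zeros
  -sum(p * log(p))
}

even_community      <- shannon_index(c(0.25, 0.25, 0.25, 0.25))  # equals log(4)
dominated_community <- shannon_index(c(0.97, 0.01, 0.01, 0.01))  # lower than log(4)
```

For a fixed number of OTUs, the index is highest when all OTUs are equally abundant and falls as one OTU comes to dominate, which is how it combines richness and evenness.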

Biodiversity in the microbiota of first-generation humanized mice differed significantly across diets.

Conclusion

  • The “obesity-inducing” diet influences the Firmicutes-Bacteroidetes ratio
  • PCA shows how diet shapes microbial composition, as well as the relationship between different phyla

  • Clustering sheds light on how the microbiota donor structures the data

  • The Western diet favours a more biodiverse gut ecosystem